Make `Rc<T>::deref` and `Arc<T>::deref` zero-cost
#132553
b283c44 to ae36f44
Would it potentially enable those types to have an FFI-compatible ABI, so that they could be passed to and returned from FFI functions directly?
I think in theory it is possible, at least for sized types, but I am not familiar with how to formally make it so.
ae36f44 to 0d6165f
0d6165f to 98edd5b
r? libs
98edd5b to 8beb51d
8beb51d to d7879fa
d7879fa to 317aa0e
@EFanZh Is this ready for review? If so, please un-draft the PR.
@joboet: The source code part is mostly done, but I haven’t finished updating LLDB and CDB pretty printers. The CI doesn’t seem to run those tests.
No worries! I just didn't want to keep you waiting in case you had forgotten to change the state.
f243654 to 1308bf6
Make `Rc<T>::deref` and `Arc<T>::deref` zero-cost

Currently, `Rc<T>` and `Arc<T>` store pointers to `RcInner<T>` and `ArcInner<T>`. This PR changes the pointers so that they point to `T` directly instead.

This is based on the assumption that we access the `T` value more frequently than accessing reference counts. With this change, accessing the data can be done without offsetting pointers from `RcInner<T>` and `ArcInner<T>` to their contained data. This change might also enable some possibly useful future optimizations, such as:

- Convert `&[Rc<T>]` into `&[&T]` within O(1) time.
- Convert `&[Rc<T>]` into `Vec<&T>` utilizing `memcpy`.
- Convert `&Option<Rc<T>>` into `Option<&T>` without branching.
- Make `Rc<T>` and `Arc<T>` FFI compatible types where `T: Sized`.
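To make the layout idea concrete, here is a minimal single-threaded sketch of a pointer that stores the address of `T` itself and keeps its count immediately before the value. `MiniRc`, its one-field `RefCounts`, and the layout helper are hypothetical illustrations, not the PR's actual code; the sketch handles only `T: Sized` and has no weak count.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::cell::Cell;
use std::ops::Deref;
use std::ptr::NonNull;

/// Hypothetical stand-in for the PR's `RefCounts`; this toy tracks only a strong count.
struct RefCounts {
    strong: Cell<usize>,
}

/// Toy single-threaded pointer (not the real `Rc`): the field points at `T` itself.
pub struct MiniRc<T> {
    value_ptr: NonNull<T>,
}

impl<T> MiniRc<T> {
    /// Allocation layout for `[padding][RefCounts][T]` and the byte offset of `T` in it.
    fn layout() -> (Layout, usize) {
        Layout::new::<RefCounts>()
            .extend(Layout::new::<T>())
            .expect("layout overflow")
    }

    pub fn new(value: T) -> Self {
        let (layout, offset) = Self::layout();
        // SAFETY: `layout` has non-zero size because it contains `RefCounts`.
        unsafe {
            let base = alloc(layout);
            if base.is_null() {
                handle_alloc_error(layout);
            }
            let value_ptr = base.add(offset).cast::<T>();
            // The counts sit immediately before the value.
            value_ptr
                .cast::<RefCounts>()
                .sub(1)
                .write(RefCounts { strong: Cell::new(1) });
            value_ptr.write(value);
            MiniRc { value_ptr: NonNull::new_unchecked(value_ptr) }
        }
    }

    fn counts(&self) -> &RefCounts {
        // SAFETY: every live `MiniRc` points just past a valid `RefCounts`.
        unsafe { &*self.value_ptr.as_ptr().cast::<RefCounts>().sub(1) }
    }

    pub fn strong_count(this: &Self) -> usize {
        this.counts().strong.get()
    }
}

impl<T> Deref for MiniRc<T> {
    type Target = T;

    fn deref(&self) -> &T {
        // No offset arithmetic: the stored pointer is already the value's address.
        unsafe { self.value_ptr.as_ref() }
    }
}

impl<T> Clone for MiniRc<T> {
    fn clone(&self) -> Self {
        let counts = self.counts();
        counts.strong.set(counts.strong.get() + 1);
        MiniRc { value_ptr: self.value_ptr }
    }
}

impl<T> Drop for MiniRc<T> {
    fn drop(&mut self) {
        let remaining = self.counts().strong.get() - 1;
        self.counts().strong.set(remaining);
        if remaining == 0 {
            let (layout, offset) = Self::layout();
            // SAFETY: this was the last owner; the allocation starts `offset` bytes
            // before the value.
            unsafe {
                let value = self.value_ptr.as_ptr();
                value.drop_in_place();
                dealloc(value.cast::<u8>().sub(offset), layout);
            }
        }
    }
}
```

In this sketch the pointer arithmetic moves from `deref` to the count accessors, which is the trade-off the description argues for.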
☀️ Try build successful - checks-actions
Finished benchmarking commit (1a76f3d): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never

Instruction count
This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

Max RSS (memory usage)
Results (primary 2.0%, secondary -0.9%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Cycles
Results (primary -1.2%, secondary -1.9%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Binary size
Results (primary 0.2%, secondary -0.9%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Bootstrap: 772.144s -> 774.372s (0.29%)
fab2460 to 320d0c5
@rustbot ready
☔ The latest upstream changes (presumably #138155) made this pull request unmergeable. Please resolve the merge conflicts.
4074802 to fd02c08
☔ The latest upstream changes (presumably #138208) made this pull request unmergeable. Please resolve the merge conflicts.
fd02c08 to 384ea40
384ea40 to 0bdb018
Neutral-ish on icounts, improved on cycles, and even shrinks optimized binaries? Nice.
    //!
    //! - Making reference-counting pointers have ABI-compatible representation as raw pointers so we
    //!   can use them directly in FFI interfaces.
    //! - Converting `Option<Rc<T>>` to `Option<&T>` with a memory copy operation.
Hmm, this one should optimize to that already with what you've already written here, right?

You could consider adding a codegen test, like

    use std::sync::Arc;

    #[no_mangle]
    pub fn option_arc_as_deref_is_nop(x: &Option<Arc<i32>>) -> Option<&i32> {
        // CHECK-LABEL: @option_arc_as_deref_is_nop(ptr
        // CHECK: %[[R:.+]] = load ptr, ptr %x
        // CHECK: ret ptr %[[R]]
        x.as_deref()
    }
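The codegen test above inspects the emitted IR and so needs the compiler's FileCheck harness. As a rough companion that runs as ordinary Rust, the sketch below checks only the observable behavior of `as_deref` and that `Option<Arc<i32>>` is pointer-sized. The helper names are hypothetical, and the size check reflects what current toolchains do as far as I know; it says nothing about whether the conversion compiles to a plain load.

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Same body as the suggested codegen test, callable from ordinary code.
pub fn option_arc_as_deref(x: &Option<Arc<i32>>) -> Option<&i32> {
    x.as_deref()
}

/// True when `Option<Arc<i32>>` occupies exactly one pointer, which means
/// `None` is represented by a null pointer rather than a separate tag.
pub fn option_arc_is_pointer_sized() -> bool {
    size_of::<Option<Arc<i32>>>() == size_of::<*const i32>()
}
```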
    impl Deref for RcLayout {
        type Target = Layout;

        fn deref(&self) -> &Self::Target {
            &self.0
        }
    }
Hmm, if "external" things should only use the inner layout field through this deref, would it be worth putting `RcLayout` in a separate module to enforce that with privacy?

(This is one of those places that really wants to be able to just do `unsafe struct RcLayout(Layout);` to enforce it that way...)
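A minimal sketch of the privacy idea, assuming a hypothetical constructor `from_value_layout` and a two-`usize` header (neither is taken from the PR): because the tuple field is private to the module, code outside it can only obtain an `RcLayout` through the checked constructor and can only read the inner `Layout` through `Deref`.

```rust
use std::alloc::Layout;

mod rc_layout {
    use std::alloc::Layout;
    use std::ops::Deref;

    /// The field is private to this module, so outside code cannot wrap an
    /// arbitrary `Layout` in an `RcLayout`.
    #[derive(Clone, Copy, Debug)]
    pub struct RcLayout(Layout);

    impl RcLayout {
        /// Hypothetical checked constructor: a two-`usize` header followed by the value.
        pub fn from_value_layout(value: Layout) -> Option<Self> {
            let header = Layout::new::<[usize; 2]>();
            let (combined, _value_offset) = header.extend(value).ok()?;
            Some(RcLayout(combined))
        }
    }

    impl Deref for RcLayout {
        type Target = Layout;

        fn deref(&self) -> &Layout {
            &self.0
        }
    }
}

/// Outside the module, only the constructor and `Deref` are reachable.
pub fn layout_size_for_u8() -> usize {
    let layout = rc_layout::RcLayout::from_value_layout(Layout::new::<u8>())
        .expect("tiny layout cannot overflow");
    layout.size()
}
```

Writing `rc_layout::RcLayout(Layout::new::<u8>())` outside the module fails to compile, which is the enforcement the comment asks about.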
    trait RcLayoutExt {
        /// Computes `RcLayout` at compile time if `Self` is `Sized`.
        const RC_LAYOUT: RcLayout;
    }
Nice that we only need one of these, since `Rc` and `Arc` can share the constant 👍
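As a rough illustration of a shared compile-time constant, the sketch below uses a blanket impl over all `Sized` types. The toy `RcLayout` (plain size and alignment fields), the two-`usize` `RefCounts`, and the rounding arithmetic are assumptions made for the example, not the PR's definitions.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical stand-in for the PR's `RefCounts` header.
#[allow(dead_code)]
struct RefCounts {
    strong: usize,
    weak: usize,
}

/// Toy version of `RcLayout`: just a size and an alignment.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct RcLayout {
    pub size: usize,
    pub align: usize,
}

pub trait RcLayoutExt {
    /// Computes `RcLayout` at compile time if `Self` is `Sized`.
    const RC_LAYOUT: RcLayout;
}

// One blanket impl serves every sized `T`, whichever pointer type uses it.
impl<T> RcLayoutExt for T {
    const RC_LAYOUT: RcLayout = {
        let header = size_of::<RefCounts>();
        let value_align = align_of::<T>();
        // Round the header size up to the value's alignment to find where `T` starts.
        let value_offset = (header + value_align - 1) / value_align * value_align;
        let align = if value_align > align_of::<RefCounts>() {
            value_align
        } else {
            align_of::<RefCounts>()
        };
        RcLayout { size: value_offset + size_of::<T>(), align }
    };
}

/// Over-aligned example type used to exercise the rounding.
#[repr(align(32))]
pub struct Aligned32(pub [u8; 32]);
```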
    unsafe fn ref_counts_ptr_from_value_ptr(value_ptr: NonNull<()>) -> NonNull<RefCounts> {
        const REF_COUNTS_OFFSET: usize = size_of::<RefCounts>();

        unsafe { value_ptr.byte_sub(REF_COUNTS_OFFSET) }.cast()
    }
ymmv: since you need to `cast` in this function anyway, you could consider avoiding the need for the cast-to-unit in the callers of this by having this be something like

    -unsafe fn ref_counts_ptr_from_value_ptr(value_ptr: NonNull<()>) -> NonNull<RefCounts> {
    -    const REF_COUNTS_OFFSET: usize = size_of::<RefCounts>();
    -    unsafe { value_ptr.byte_sub(REF_COUNTS_OFFSET) }.cast()
    -}
    +unsafe fn ref_counts_ptr_from_value_ptr<T: ?Sized>(value_ptr: NonNull<T>) -> NonNull<RefCounts> {
    +    unsafe { value_ptr.cast::<RefCounts>().sub(1) }
    +}

(That ought to simplify the MIR too, since `byte_sub` has to cast to `NonNull<u8>` then cast back again, but if you cast and can just `sub` you avoid that step. Of course the conversions like that are optimized out by LLVM anyway, but...)
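To see that the two spellings compute the same address, here is a small sketch that runs both against a stack-allocated block. `Block`, the two-`usize` `RefCounts`, and the function names are hypothetical; `NonNull::byte_sub` and `NonNull::sub` need Rust 1.80 or later.

```rust
use std::mem::size_of;
use std::ptr::{addr_of_mut, NonNull};

/// Hypothetical stand-in for the PR's `RefCounts`.
struct RefCounts {
    strong: usize,
    weak: usize,
}

/// `repr(C)` keeps `value` directly after `counts` in this demo block.
#[repr(C)]
struct Block {
    counts: RefCounts,
    value: usize,
}

/// Byte-based form: subtract the header size, then cast.
unsafe fn via_byte_sub(value_ptr: NonNull<()>) -> NonNull<RefCounts> {
    unsafe { value_ptr.byte_sub(size_of::<RefCounts>()) }.cast()
}

/// Typed form from the review: cast first, then step back one `RefCounts`.
unsafe fn via_typed_sub<T: ?Sized>(value_ptr: NonNull<T>) -> NonNull<RefCounts> {
    unsafe { value_ptr.cast::<RefCounts>().sub(1) }
}

pub fn both_forms_agree() -> bool {
    let mut block = Block { counts: RefCounts { strong: 1, weak: 2 }, value: 42 };
    let base: *mut Block = &mut block;
    // SAFETY: `base` points at a live `Block`, and `value` sits right after `counts`.
    unsafe {
        let value_ptr = NonNull::new_unchecked(addr_of_mut!((*base).value));
        let a = via_byte_sub(value_ptr.cast::<()>());
        let b = via_typed_sub(value_ptr);
        a == b
            && a.as_ptr() == base.cast::<RefCounts>()
            && a.as_ref().strong == 1
            && b.as_ref().weak == 2
            && *value_ptr.as_ref() == 42
    }
}
```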
    /// Get a pointer to the strong counter object in the same allocation with a value pointed to by
    /// `value_ptr`.
    ///
    /// # Safety
    ///
    /// - `value_ptr` must point to a value object (can be uninitialized or dropped) that lives in a
    ///   reference-counted allocation.
    unsafe fn strong_count_ptr_from_value_ptr(value_ptr: NonNull<()>) -> NonNull<UnsafeCell<usize>> {
        const STRONG_OFFSET: usize = size_of::<RefCounts>() - mem::offset_of!(RefCounts, strong);

        unsafe { value_ptr.byte_sub(STRONG_OFFSET) }.cast()
    }

    /// Get a pointer to the weak counter object in the same allocation with a value pointed to by
    /// `value_ptr`.
    ///
    /// # Safety
    ///
    /// - `value_ptr` must point to a value object (can be uninitialized or dropped) that lives in a
    ///   reference-counted allocation.
    unsafe fn weak_count_ptr_from_value_ptr(value_ptr: NonNull<()>) -> NonNull<UnsafeCell<usize>> {
        const WEAK_OFFSET: usize = size_of::<RefCounts>() - mem::offset_of!(RefCounts, weak);

        unsafe { value_ptr.byte_sub(WEAK_OFFSET) }.cast()
    }
suggestion: Can you avoid doing manual layout calculations here? If you converted to a `NonNull<RefCounts>` first, then `&raw` can just mention the field to get its pointer, rather than needing to `offset_of` and deal in raw bytes.

(I think that'd let you drop the `repr(C)` on `RefCounts` too, which would be nice. I don't think there should be a need for it -- I don't think any of the logic here really cares whether the strong or weak count is first in memory.)
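A minimal sketch of the field-projection approach, assuming a `RefCounts` with two `UnsafeCell<usize>` fields and no `repr(C)`. It uses `addr_of_mut!`, the macro equivalent of `&raw mut`, so that it also builds on toolchains older than the one that stabilized the `&raw` syntax; the function names are hypothetical.

```rust
use std::cell::UnsafeCell;
use std::ptr::{addr_of_mut, NonNull};

/// Hypothetical stand-in for the PR's `RefCounts`; note there is no `repr(C)`.
struct RefCounts {
    strong: UnsafeCell<usize>,
    weak: UnsafeCell<usize>,
}

/// Projects to the `strong` field; the compiler supplies the offset.
unsafe fn strong_count_ptr(counts: NonNull<RefCounts>) -> NonNull<UnsafeCell<usize>> {
    unsafe { NonNull::new_unchecked(addr_of_mut!((*counts.as_ptr()).strong)) }
}

/// Projects to the `weak` field, wherever the compiler placed it.
unsafe fn weak_count_ptr(counts: NonNull<RefCounts>) -> NonNull<UnsafeCell<usize>> {
    unsafe { NonNull::new_unchecked(addr_of_mut!((*counts.as_ptr()).weak)) }
}

pub fn read_counts() -> (usize, usize) {
    let mut counts = RefCounts {
        strong: UnsafeCell::new(1),
        weak: UnsafeCell::new(2),
    };
    let ptr = NonNull::from(&mut counts);
    // SAFETY: `ptr` points at a live, initialized `RefCounts` for the whole block.
    unsafe {
        let strong = strong_count_ptr(ptr);
        let weak = weak_count_ptr(ptr);
        (*strong.as_ref().get(), *weak.as_ref().get())
    }
}
```

Because no byte offsets appear in the code, reordering the two fields would not change its behavior, which is the property the comment relies on for dropping `repr(C)`.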